Abstract



We show that it is decidable in exponential time whether the lex- 
icographic ordering of a context-free language is scattered, or a well- 
ordering. 



1 Introduction 

When the alphabet $\Sigma$ of a language $L \subseteq \Sigma^*$ is linearly ordered, $L$ may be
equipped with the lexicographic order, turning $L$ into a linearly ordered set.
Every countable linear ordering may be represented as the lexicographic
ordering of a language (over the two-letter alphabet). A (deterministic)
context-free linear order is a linear ordering that can be represented as the
lexicographic ordering of a (deterministic) context-free language. The study
of context-free linear orderings was initiated in [3]. In [4], it was shown
that a well-ordering is deterministic context-free (or equivalently, definable
by an algebraic recursion scheme) iff its order type is less than $\omega^{\omega^\omega}$. Then,
in [5] it was shown that the Hausdorff rank of any deterministic context-free
linear ordering is less than $\omega^\omega$. For an extension of these results to linear
orderings definable by higher order recursion schemes we refer to [2].

Any monadic second-order definable property is decidable for deterministic
context-free linear orders (given by LR(1) grammars, say). This follows
from a general decidability result for graphs in the pushdown hierarchy [6],
more exactly from the "uniform version" of this result. In particular, it
is decidable whether a deterministic context-free linear ordering is dense,
or scattered, or a well-ordering. The results of [4, 5] implicitly give rise to
practical algorithms for deterministic context-free languages. In contrast, as
shown in [9] , it is undecidable for a context-free linear ordering whether it is 
dense. The main results of this paper show that on the contrary, there is an 
exponential time algorithm to decide whether a context-free linear ordering 
is scattered, or a well-ordering. The fact that these properties are decidable 
for context-free linear orderings was first announced in 

2 Linear orderings 

In this paper, by a linear ordering L = (L, <) we shall mean a countable lin- 
ear ordering. We will use standard terminology as in [11] . The isomorphism 
class of a linear ordering is its order-type. 

A linear ordering $L$ is dense if it has at least two elements and for all $x, y \in L$,
if $x < y$ then there is some $z$ with $x < z < y$. Up to isomorphism there are
four (countable) dense linear orderings: the ordering of the rationals, whose
order-type is denoted $\eta$, possibly endowed with a least or greatest element,
or both. A scattered linear order is a linear ordering that has no dense
sub-order. A well-ordering is a linear ordering that has no sub-ordering
isomorphic to the ordered set of the negative integers. Every well-ordering
is scattered.

A linear ordering is quasi-dense if it is not scattered. It is well-known that
any scattered sum or finite union of scattered linear orderings is scattered.
Thus, if $I$ is a scattered linear ordering and for each $i \in I$, $L_i$ is a scattered
linear ordering, then so is $\sum_{i \in I} L_i$. Moreover, if a linear ordering $L$ is the
finite union of sub-orderings $L_i$, $i = 1, \ldots, n$, then $L$ is quasi-dense iff at
least one of the $L_i$ is quasi-dense.

Suppose that $A$ is an alphabet whose letters are ordered by $a_1 < \ldots < a_k$.
Then we define the strict order $<_s$ on the set of words $A^*$ by $u <_s v$ iff
$u = xa_iy$ and $v = xa_jy'$ for some $x, y, y' \in A^*$ and letters $a_i$ and $a_j$ with
$a_i < a_j$. The prefix order is defined by $u <_p v$ iff $u$ is a proper prefix of $v$.
The strict order and the prefix order are partial orders. The lexicographic
order $<_\ell$ is the union of the two, so that $x <_\ell y$ iff $x <_s y$ or $x <_p y$. Clearly,
$(A^*, <_\ell)$ is a linear ordering.
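The three orders just defined can be illustrated with a minimal sketch. It assumes that words are Python strings and that the order of the letters agrees with Python's character order (as it does for the digits 0 and 1); the function names are hypothetical and not taken from the paper.

```python
def strict_less(u: str, v: str) -> bool:
    """u <_s v: at the first position where u and v differ, u has the smaller letter."""
    for a, b in zip(u, v):
        if a != b:
            return a < b
    return False  # one word is a prefix of the other, or the words are equal


def prefix_less(u: str, v: str) -> bool:
    """u <_p v: u is a proper prefix of v."""
    return len(u) < len(v) and v.startswith(u)


def lex_less(u: str, v: str) -> bool:
    """u <_l v: the union of the strict order and the prefix order."""
    return strict_less(u, v) or prefix_less(u, v)
```

Under the stated assumption on the letter order, `lex_less` coincides with Python's built-in string comparison `u < v`.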

If $L \subseteq A^*$ then $(L, <_\ell)$ is a linear ordering, called the lexicographic ordering
of $L$. We call $L$ dense, scattered or well-ordered if $(L, <_\ell)$ has the appropriate
property. When $L$ is a (deterministic) context-free language, we call
$(L, <_\ell)$, and sometimes any linear ordering isomorphic to $(L, <_\ell)$, a
(deterministic) context-free linear ordering. Every (deterministic) context-free
linear ordering is isomorphic to the lexicographic ordering of a (deterministic)
context-free language over the alphabet $\{0, 1\}$, ordered by $0 < 1$. Indeed,
when $L \subseteq A^*$ and $A$ has $k$ letters $a_1 < \ldots < a_k$, say, then we may encode
each letter with a binary word $h(a_i)$ of length $\lceil \log k \rceil$ over $\{0, 1\}$ so that
$h(a_i) <_s h(a_j)$ whenever $a_i < a_j$; then $(L, <_\ell)$ is isomorphic to $(h(L), <_\ell)$.
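The encoding argument can be made concrete with a short sketch. It assumes the alphabet is given as a list of single-character letters in increasing order; the names `binary_encoding` and `encode` are hypothetical, and the width is forced to be at least 1 (an assumption of this sketch, so that a one-letter alphabet still gets nonempty codes).

```python
def binary_encoding(alphabet):
    """Map the i-th letter (in increasing order) to the i-th fixed-width binary word.

    The width is ceil(log2(k)) for k letters, computed exactly with bit_length.
    """
    k = len(alphabet)
    width = max(1, (k - 1).bit_length())  # assumption: at least one bit per letter
    return {a: format(i, "0{}b".format(width)) for i, a in enumerate(alphabet)}


def encode(word, h):
    """Extend the letter encoding h to words letter by letter."""
    return "".join(h[a] for a in word)
```

Because all codes have the same length and increase with the letters, the first difference between two encoded words falls inside the codes of the first differing letters, which is why the lexicographic order is preserved.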

When $L \subseteq \{0, 1\}^*$ then $T(L)$ is the binary tree whose vertices are the words
in the prefix closure of $L$. $T(L)$ is nonempty if $L$ is nonempty. A vertex $y$
is a descendant of vertex $x$ if $x$ is a prefix of $y$. The following fact is quite
standard:

Proposition 2.1 Suppose that $L \subseteq \{0, 1\}^*$ and consider the corresponding
tree $T(L)$. Then $L$ is quasi-dense iff the full binary tree has an embedding
in $T(L)$.

Proof. Let $L_0$ be the regular prefix language $(00+11)^*01$ whose lexicographic
ordering has order type $\eta$, and consider the tree $T(L_0)$. If the full binary
tree embeds in $T(L)$, then so does $T(L_0)$. Consider an embedding of $T(L_0)$
in $T(L)$ which maps each vertex $x$ of $T(L_0)$ to a vertex $h(x)$ of $T(L)$. For
each leaf $x$ of $T(L_0)$ select a word $v_x \in L$ which is a descendant of $h(x)$ in $T(L)$.
The words $v_x$ form a dense subset of $L$ with respect to the lexicographic
order. (Note also that any two words $v_x$ are actually related by the strict
order.)

For the reverse direction, suppose that $L$ is quasi-dense. Let us color a
vertex $x$ of $T(L)$ blue if $x \in L$. Call a vertex $x$ of $T(L)$ appropriate if the
blue vertices of the subtree $T_x$ rooted at $x$ form a quasi-dense linear ordering
with respect to the lexicographic order. If $x$ is appropriate, then it has at
least two proper descendants $y$ and $z$ which are appropriate vertices with
$y <_s z$. Indeed, $x$ has a proper descendant $x'$ such that both the set of blue
vertices $y'$ of $T_x$ with $y' <_s x'$ and the set of blue vertices $z'$ of $T_x$ with
$x' <_\ell z'$ form quasi-dense linear orderings with respect to the lexicographic
order. Suppose that $x' = xu$, where $u$ is a nonempty word. Then one of the
vertices $xv0$, where $v1$ is a prefix of $u$, is appropriate, as is one of the vertices
$xv1$ and $x'$, where $xv0$ is a prefix of $x'$. Let $y$ and $z$ be these vertices.

Thus, starting from the root of $T(L)$, we can construct a set $V$ of appropriate
vertices such that each $x \in V$ has two (proper) descendants $y$ and $z$ in $V$
with $y <_s z$. The vertices in $V$ determine an embedding of the full binary
tree in $T(L)$. □

Proposition 2.2 Suppose that $L, L' \subseteq \{0,1\}^*$. If $(L, <_\ell)$ and $(L', <_\ell)$ are
both scattered, then so is $(LL', <_\ell)$.

Proof. We will prove that if $(LL', <_\ell)$ is quasi-dense, then one of $(L, <_\ell)$
and $(L', <_\ell)$ is quasi-dense. Assuming that $LL'$ is quasi-dense, $T(LL')$ has
an embedded copy $T_0$ of the full binary tree. Let us color a vertex $u$ of
$T(LL')$ blue if $uv \in L$ for some $v \in \{0, 1\}^*$, i.e., when $u$ has a descendant
in $L$. There are two cases to consider: either each subtree of $T_0$ contains a
blue vertex, or there is a subtree of $T_0$ having no blue vertex.

Case 1. Suppose that each subtree of $T_0$ contains a blue vertex. Since every
prefix of a blue vertex is blue, each vertex of $T_0$ is colored blue, so that
$T_0$ embeds in $T(L)$ and $L$ is quasi-dense.

Case 2. Suppose that $T_0$ contains a subtree having no blue vertex. Let $T_1$
denote such a subtree and let $u$ denote the root of $T_1$. Let $u_0, \ldots, u_k$ be all
the (proper) prefixes of $u$ that are in $L$. Now let us color each vertex $x$ of
$T_1$ with the set of all integers $i$, $0 \leq i \leq k$, such that $x$ has a descendant in
$T(LL')$ which is a word in $u_iL'$. Then each vertex $x$ of $T_1$ is labeled by a
nonempty subset of the set $\{0, \ldots, k\}$, and if $x'$ is a descendant of $x$ in $T_1$,
then the label of $x'$ is included in the label of $x$. Let $H$ be a minimal set
that appears as the label of a vertex $v$ of $T_1$. Then all descendants of $v$ in $T_1$
are labeled $H$. Thus, if $i \in H$, then the full binary tree embeds in $T(u_iL')$
and thus in $T(L')$, so that $L'$ is quasi-dense. □

3 Scattered context-free linear orderings 

In this section, we assume that $G = (N, \{0, 1\}, P, S)$ is a context-free grammar
with nonterminal alphabet $N$, terminal alphabet $\{0, 1\}$, rules $P$ and
start symbol $S$ that contains no useless nonterminals or $\varepsilon$-rules. Moreover,
we assume that $G$ is left-recursion free and that $L(G)$ is not empty. These
can be assumed for the results of the paper, since there is an easy polynomial
time transformation of a context-free grammar to a grammar over the
alphabet $\{0, 1\}$ that generates an isomorphic language (with respect to the
lexicographic order) not containing $\varepsilon$, and each grammar not generating the
empty word can be transformed in polynomial time into an equivalent grammar
that contains no useless nonterminals or $\varepsilon$-rules or any left-recursive
nonterminal. See [U [8] . 

We let $X, Y, Z$ (sometimes decorated) denote nonterminals, $u, v, w, x, y$ terminal
words in $\{0,1\}^*$, and we let $p, q, r$ denote words in $(N \cup \{0,1\})^*$.
For every word $p$, we denote by $L(p)$ the set of all words $w \in \{0, 1\}^*$ with
$p \Rightarrow^* w$. Thus, the language $L(G)$ generated by $G$ is $L(S)$. The length of $p$
is denoted $|p|$.

For nonterminals $X$ and $Y$ we define $Y \preceq X$ iff there exist $p, q$ with $X \Rightarrow^* pYq$,
and we define $X \approx Y$ if both $X \preceq Y$ and $Y \preceq X$ hold. When $X \approx Y$,
we say that $X$ and $Y$ belong to the same strong component. When $Y \preceq X$
but $X \not\preceq Y$, we also write $Y \prec X$. The height of a nonterminal $X$ is the
length $k$ of the longest sequence $Y_1 \prec \ldots \prec Y_k = X$. When $C$ is a strong
component and $X \in C$ has height $k$, we also say that $C$ has height $k$.
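The relation $\preceq$ and the height of a nonterminal can be computed by plain reachability. The following is a minimal sketch under an assumed grammar representation (a dict mapping each nonterminal to a list of right-hand sides, each a list of symbols); the function names are hypothetical and the sketch favours clarity over efficiency.

```python
from functools import lru_cache


def reach(grammar):
    """For each nonterminal X, the set of nonterminals Y with X =>* pYq (X included)."""
    direct = {X: {s for rhs in rules for s in rhs if s in grammar}
              for X, rules in grammar.items()}
    below = {}
    for X in grammar:
        seen, stack = {X}, [X]
        while stack:
            for Y in direct[stack.pop()]:
                if Y not in seen:
                    seen.add(Y)
                    stack.append(Y)
        below[X] = seen
    return below


def heights(grammar):
    """Height of X: length of the longest chain Y_1 < ... < Y_k = X of strict steps."""
    below = reach(grammar)

    @lru_cache(maxsize=None)
    def height(X):
        # Y is strictly below X when X reaches Y but Y does not reach X.
        strictly_below = [Y for Y in below[X] if X not in below[Y]]
        return 1 + max((height(Y) for Y in strictly_below), default=0)

    return {X: height(X) for X in grammar}
```

Two nonterminals lie in the same strong component exactly when each appears in the other's `reach` set, so they always receive the same height.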

A primitive word is a nonempty word that is not a proper power. For
elementary properties of primitive words we refer to [10].
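Since primitive roots are used throughout the rest of the paper, here is a minimal sketch of how the primitive root of a nonempty word can be computed by trying every divisor of its length. The function names are hypothetical.

```python
def primitive_root(w: str) -> str:
    """Return the shortest word r such that w is a power of r."""
    n = len(w)
    if n == 0:
        raise ValueError("the empty word has no primitive root")
    for d in range(1, n + 1):
        # r = w[:d] is a root of w exactly when d divides n and r repeated gives w.
        if n % d == 0 and w[:d] * (n // d) == w:
            return w[:d]


def is_primitive(w: str) -> bool:
    """A word is primitive when it is nonempty and equal to its own primitive root."""
    return len(w) > 0 and primitive_root(w) == w
```

For example, `primitive_root("010101")` is `"01"`, while `"011"` is its own primitive root.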

Theorem 3.1 The following conditions are equivalent.

1. $(L(G), <_\ell)$ is a scattered linear ordering.

2. There exist no nonterminal $X$ and words $u, v \in \{0, 1\}^*$ such that
neither $u$ is a prefix of $v$ nor $v$ is a prefix of $u$, and moreover
$X \Rightarrow^* uXp$ and $X \Rightarrow^* vXq$ hold for some $p, q$.

3. For each recursive nonterminal $X$ there is a primitive word $u_0 = u_0^X$
such that whenever $X \Rightarrow^+ wXp$ then $w \in u_0^+$.

4. For each strong component $C$ containing a recursive nonterminal there
is a primitive word $u_0 = u_0^C$, unique up to conjugacy, such that for
all $X, Y \in C$ there is a (necessarily unique) conjugate $v_0$ of $u_0$ and a
proper prefix $v_1$ of $v_0$ such that if $X \Rightarrow^+ wYp$ for some $w \in \{0, 1\}^*$
and $p \in (N \cup \{0, 1\})^*$ then $w \in v_0^* v_1$.

Proof. It is easy to prove that the first condition implies the second. Suppose
that $L(G)$ is scattered and let $X \Rightarrow^* uXp$ and $X \Rightarrow^* vXq$. If neither $u$ is a
prefix of $v$ nor $v$ is a prefix of $u$, then $u$ and $v$ are nonempty and comparable
with respect to the strict order, say $u <_s v$. Suppose that $S \Rightarrow^* wXr$. The
vertices $w(u + v)^*$ determine an embedding of the full binary tree in $T(L)$,
where $L = L(G)$. Thus, by Proposition 2.1, $L$ is quasi-dense, a contradiction.
Thus, either $u$ is a prefix of $v$ or vice versa.

Suppose now that the second condition holds. We prove that the third
condition also holds. Let $X$ be a recursive nonterminal and suppose that
$X \Rightarrow^+ uXp$. Then $u$ is nonempty (since $G$ is left-recursion free) and thus
has a primitive root $u_0$. We claim that whenever $X \Rightarrow^+ wXq$ then $w$ is a
power of $u_0$. Indeed, if $X \Rightarrow^+ wXq$ then $w$ is also nonempty and thus there
exist $m, n > 0$ with $|u^m| = |w^n|$. Since $X \Rightarrow^+ u^mXp^m$ and $X \Rightarrow^+ w^nXq^n$,
the second condition implies that one of $u^m$ and $w^n$ is a prefix of the other.
As they have the same length, it follows that $u^m = w^n$, so that $u_0$ is also the
primitive root of $w$.

Next we prove that the third condition implies the fourth. So assume that 
the third condition holds. Note that if a strong component contains a re- 
cursive nonterminal, then all nonterminals in that strong component are 
recursive. 

Lemma 3.2 Suppose that $X, Y$ are different recursive nonterminals that
belong to the same strong component. Then $u_0^X$ and $u_0^Y$ are conjugate.

Proof. Since $X, Y$ belong to the same strong component, there exist $x, y$
and $p, q$ with $X \Rightarrow^+ xYp$ and $Y \Rightarrow^+ yXq$. Thus, $X \Rightarrow^+ xyXqp$ and
$Y \Rightarrow^+ yxYpq$. Thus, $xy$ is a power of $u_0^X$ and $yx$ is a power of $u_0^Y$. Since
$xy$ and $yx$ are conjugate and $u_0^X$ and $u_0^Y$ are primitive, this is possible only
if $u_0^X$ and $u_0^Y$ are conjugate. □

Using the lemma, we now complete the proof of the fact that the third 
condition implies the fourth. 

Suppose that the strong component $C$ contains a recursive nonterminal and
$X_0 \in C$. Let $u_0^C = u_0^{X_0}$. For the sake of simplicity, below we will just write
$u_0$ for this word. Let $X, Y \in C$ with $X \Rightarrow^+ wYp$ and $Y \Rightarrow^+ xXq$, where
$w, x, p, q$ are appropriate words, so that $X \Rightarrow^+ wxXqp$. By Lemma 3.2 we
have that $wx$ is a power of a primitive word $v_0$ which is a conjugate of $u_0$. It
is clear that $v_0$ is unique. Also, $w = v_0^n v_1$ for some $n \geq 0$ and some proper
prefix $v_1$ of $v_0$.

We still need to show that if $X \Rightarrow^+ w'Yp'$ for some $w'$ and $p'$, then $w'$ can
be written as $v_0^m v_1$ for some $m$. But in this case $X \Rightarrow^+ w'xXqp'$ and $w'x$ is
a power of $v_0$. Since the length of $w'$ is congruent to the length of $w$ modulo
the length of $v_0$, it follows that $w' = v_0^m v_1$ for some $m \geq 0$. This ends the
proof of the fact that the third condition implies the fourth.

Suppose finally that the fourth condition holds. Then clearly, the third
condition also holds. We want to prove that $L(G)$ is scattered. To this end,
we establish several preliminary facts.

Definition 3.3 Suppose that $X$ is a recursive nonterminal and let $u_0 = u_0^X$.
For each $n \geq 0$ and prefix $ui$ of $u_0$, where $i = 0, 1$, let $L(X, n, ui)$ denote the
set of all words of the form $u_0^n u \bar{i} w$ in $L(X)$, where $w \in \{0, 1\}^*$ and $\bar{i} = 1$ iff
$i = 0$.

Let $X$ be a recursive nonterminal. The following facts are clear. (Below we
continue writing $u_0$ for $u_0^X$.)

Proposition 3.4 Each word in $L(X)$ is either in $L(X, n, ui)$ for some $n \geq 0$
and prefix $ui$ of $u_0$, or is a word of the form $u_0^n u$ where $n \geq 0$ and $u$ is a
proper prefix of $u_0$.
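The case distinction of Proposition 3.4 amounts to locating the first position at which a word leaves the infinite word $u_0 u_0 u_0 \cdots$. A minimal sketch of this classification follows; it looks only at the shape of the word and does not check membership in $L(X)$, and the function name and the returned tags are hypothetical.

```python
def classify(w: str, u0: str):
    """Classify w against the periodic word u0 u0 u0 ...

    Returns ("power", n, u) when w = u0^n u with u a proper prefix of u0, and
    ("deviates", n, ui) when w = u0^n u ibar w', where ui is a prefix of u0
    and ibar is the letter different from i.
    """
    m = len(u0)
    for pos, letter in enumerate(w):
        expected = u0[pos % m]
        if letter != expected:
            n, r = divmod(pos, m)
            # u is the part of u0 read so far in the current period, i the expected letter.
            return ("deviates", n, u0[:r] + expected)
    n, r = divmod(len(w), m)
    return ("power", n, u0[:r])
```

For instance, with $u_0 = 01$ the word $0100$ equals $u_0^1 \cdot 0 \cdot 0$, so it deviates after $n = 1$ with $ui = 01$ (that is, $u = 0$ and $i = 1$).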

Proposition 3.5 For each $n \geq 0$ and prefix $ui$ of $u_0$ there is only a finite
number of left derivations

$$X \Rightarrow^* wYp \Rightarrow u_0^n u \bar{i} q \qquad (1)$$

such that $u_0^n u \bar{i}$ is not a prefix of $w$.

Let us denote by $F(X, n, ui)$ the finite set of all words $q$ that occur in
derivations (1).

Proposition 3.6 If $Y$ is a nonterminal that occurs in a word $q \in F(X, n, ui)$
for some $n \geq 0$ and prefix $ui$ of $u_0$, then $Y \prec X$.

Proof. Suppose that (1) is a left derivation and $Y$ occurs in $q$, so that
$q = q_1Yq_2$ for some $q_1, q_2$. If $X \preceq Y$ then there exist some $r_1, r_2$ with
$Y \Rightarrow^* r_1Xr_2$. Thus, $q = q_1Yq_2 \Rightarrow^* q_1r_1Xr_2q_2$. Let $v$ denote a terminal
word with $q_1r_1 \Rightarrow^* v$. Then we have

$$X \Rightarrow^+ u_0^n u \bar{i} q \Rightarrow^* u_0^n u \bar{i} v X r_2 q_2.$$

Since $u_0^n u \bar{i} v$ is not a power of $u_0$, this contradicts the third condition. □

We now complete the proof of Theorem 3.1.

Let $X$ be a nonterminal. We prove the following fact: If $(L(Y), <_\ell)$ is
scattered for all nonterminals $Y$ whose height is less than the height of $X$,
then $(L(X), <_\ell)$ is scattered.

If $X$ is not a recursive nonterminal, then the height of each nonterminal
appearing on the right side of a rule $X \to p$ is less than the height of $X$.
Moreover, $L(X)$ is the finite union of all languages $L(p)$ where $X \to p$ is in $P$. By
the induction hypothesis and Proposition 2.2, each linear ordering $(L(p), <_\ell)$
is scattered. Since any finite union of scattered linear orderings is scattered,
$(L(X), <_\ell)$ is also scattered.

Suppose now that $X$ is recursive. Then by Proposition 3.4,

$$L(X) = L_0 \cup \bigcup_{n \geq 0,\ ui} L(X, n, ui)$$

where $ui$ ranges over the prefixes of $u_0 = u_0^X$ and each word of $L_0$ is of the
form $u_0^n v$ for some $n \geq 0$ and some proper prefix $v$ of $u_0$. It is clear that $L_0$
is scattered (in fact, either a finite linear ordering or an $\omega$-chain). Thus, it
suffices to show that

$$\Bigl(\bigcup_{n \geq 0,\ ui} L(X, n, ui),\ <_\ell\Bigr)$$

is scattered. But this linear ordering is isomorphic to the ordered sum

$$\sum_{n \geq 0,\ u1} L(X, n, u1) + \sum_{n \leq 0,\ u0} L(X, -n, u0)$$

where in the first term $u1$ is a prefix of $u_0$ and in the second term $u0$ is a
prefix of $u_0$. Since a scattered sum of scattered linear orderings is scattered,
it remains to show that each $L(X, n, ui)$ is scattered. But by Propositions 3.5
and 3.6, for each $n$ and $ui$, $L(X, n, ui)$ is a finite union of languages of the form
$u_0^n u \bar{i} L(q)$ where $q$ contains only nonterminals of height strictly less than
the height of $X$. Thus, by the induction hypothesis and Proposition 2.2,
each such language is scattered. Since any finite union of scattered linear
orderings is scattered, it follows that $L(X, n, ui)$ is scattered. This ends the
proof of the fact that the fourth condition implies the first. The proof of
Theorem 3.1 is complete. □

Theorem 3.7 $L(G)$ is well-ordered iff $L(G)$ is scattered and there is no
recursive nonterminal $X$ such that $L(X)$ contains a word $w$ such that $u_0^n <_s w$
for some $n$, where $u_0 = u_0^X$.

Proof. Note that the extra condition is equivalent to the following: for all recursive
nonterminals $X$, all $n \geq 0$ and any prefix $u0$ of $u_0 = u_0^X$, the language $L(X, n, u0)$
is empty. Now by repeating the last part of the proof of Theorem 3.1 it
follows that under this condition, if $L(G)$ is scattered, then $L(X)$ is
well-ordered for all $X$. One uses the well-known fact that if a linear order is
a finite union of well-orderings, then it is also a well-ordering, and that a
well-ordered sum of well-orderings is well-ordered.

On the other hand, if the extra condition is not satisfied for the recursive
nonterminal $X$, then $L(X)$ is not well-ordered. For suppose that $L(X, n, u0)$
contains the word $u_0^n u 1 x$. We know that there is some $m \geq 1$ and some
terminal word $v$ with $X \Rightarrow^+ u_0^m X v$. Thus, the words $u_0^{mk} u_0^n u 1 x v^k$ for
$k = 0, 1, \ldots$ form a strictly decreasing sequence in $L(X)$. We conclude by noting
that if $L(X)$ is not well-ordered for some $X$, then $L(G)$ is not well-ordered either, since
$G$ contains no useless nonterminals. □

At this point, we are already able to show that it is decidable whether $L(G)$
is scattered, or well-ordered.

Corollary 3.8 There exists an algorithm to decide whether $L(G)$ is scattered.

Proof. As before, we may assume that $G = (N, \{0,1\}, P, S)$ contains no
useless nonterminals or $\varepsilon$-rules. Moreover, we may assume that $G$ is
left-recursion free and that $L(G)$ is not empty. By Theorem 3.1 we know that
$L(G)$ is scattered iff for each recursive nonterminal $X$ there is a primitive
word $u_0^X$ such that whenever $X \Rightarrow^+ wXp$ then $w \in (u_0^X)^+$. We are going to test
this condition. Given a recursive nonterminal $X$ in the strong component $C$,
we find a word $u$ such that $X \Rightarrow^+ uXp$ for some $p$. Clearly, $u \neq \varepsilon$. Let $u_0$
denote the primitive root of $u$. Then consider the following grammar $G_X$.
The nonterminals are the nonterminals of $G$ together with the nonterminals
$\overline{Y}$, where $Y \in C$. The rules are those of $G$ together with the rules

$$\overline{Y} \to p\overline{Z}$$

such that $Y, Z \in C$ and there is some $q$ with $Y \to pZq \in P$. There is one
more rule, $\overline{X} \to \varepsilon$. Let $\overline{X}$ be the start symbol. Then $L(G_X) \subseteq u_0^*$ iff for
all $w$ such that $X \Rightarrow^+ wXp$ for some $p$ in $G$, it holds that $w \in u_0^+$. Now
$L(G_X) \subseteq u_0^*$ iff the intersection of $L(G_X)$ with the complement of $u_0^*$ is
empty, which is decidable, since the intersection of a context-free language
with a regular language is context-free and emptiness is decidable for
context-free languages. □
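The construction of $G_X$ in the proof above is purely syntactic and can be sketched as follows. The sketch assumes a grammar given as a dict from nonterminals to lists of right-hand sides (lists of symbols), represents the barred copy of $Y$ by the hypothetical name `Y + "_bar"`, and writes the empty right-hand side as an empty list. It only builds the grammar and does not perform the emptiness test.

```python
def build_G_X(grammar, component, X):
    """Return (rules, start) for the grammar G_X built from G, a component C and X in C."""
    def bar(Y):
        return Y + "_bar"  # hypothetical naming scheme for the barred copy of Y

    # Keep all rules of G unchanged.
    rules = {Y: [list(rhs) for rhs in rhss] for Y, rhss in grammar.items()}
    for Y in component:
        rules[bar(Y)] = []
    for Y in component:
        for rhs in grammar[Y]:
            for j, Z in enumerate(rhs):
                if Z in component:
                    # The rule Y -> p Z q of G contributes bar(Y) -> p bar(Z).
                    rules[bar(Y)].append(list(rhs[:j]) + [bar(Z)])
    rules[bar(X)].append([])  # the extra rule bar(X) -> epsilon
    return rules, bar(X)
```

For the one-nonterminal grammar with rules $X \to 0X1$ and $X \to 01$, the barred part consists of $\overline{X} \to 0\overline{X}$ and $\overline{X} \to \varepsilon$, which generates $0^*$.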

Corollary 3.9 There exists an algorithm to decide whether $L(G)$ is
well-ordered.

Proof. The extra condition introduced in Theorem 3.7 can be effectively
tested, since it says that for each recursive nonterminal $X$, the intersection
of $L(X)$ with the regular language of all words of the form $u_0^n u 1 x$, where
$n \geq 0$ and $u0$ is a prefix of $u_0$, is empty. □



4 Decidability in exponential time 

In this section, we give somewhat more efficient algorithms. First we need 
some preparation. 

Suppose that $u_0 \in \{0, 1\}^*$ is a fixed primitive word, and consider the set
$S$ of all pairs $(x_1, x_2)$, where $x_1$ is a proper suffix of $u_0$ and $x_2$ is a proper
prefix of $u_0$. In particular, $(\varepsilon, \varepsilon) \in S$. With each $(x_1, x_2) \in S$ we associate
the language $L(x_1, x_2) = x_1u_0^*x_2$, if $|x_1x_2| < |u_0|$, and $L(x_1, x_2) = x_1u_0^*x_2 + z$
where $z$ is the suffix of $x_1x_2$ obtained by removing its prefix of length $|u_0|$,
if $|x_1x_2| \geq |u_0|$. (Note that the prefix of length $|u_0|$ of $x_1x_2$ is a primitive
word which is a conjugate of $u_0$.) We call a word $w$ legitimate if it belongs
to $L(x_1, x_2)$ for some $(x_1, x_2) \in S$. Clearly, a word is legitimate iff it is
a subword of some power of $u_0$ iff it is in $v_0^*z$ for some conjugate $v_0$ of
$u_0$ and some necessarily unique proper prefix $z$ of $v_0$. Moreover, for each
$(x_1, x_2) \in S$ there is a unique conjugate $v_0$ of $u_0$ and a unique proper prefix
$z$ of $v_0$ with $L(x_1, x_2) = v_0^*z$. It follows from this fact that any two languages
$L(x_1, x_2)$ and $L(y_1, y_2)$ for $(x_1, x_2) \neq (y_1, y_2)$ in $S$ are either disjoint or have
a single common element which is a proper subword of $u_0$. In particular,
for any legitimate word $u$ with $|u| \geq |u_0|$ there is a unique $(x_1, x_2) \in S$ with
$u \in L(x_1, x_2)$.

It is also clear that any subword of a legitimate word is legitimate, and
if $u \in L(x_1, x_2)$, say, and $v$ is obtained from $u$ by removing a subword of
length $|u_0|$, then $v$ is legitimate with $v \in L(x_1, x_2)$. Also, if $v$ is obtained
by duplicating a subword of $u$ of length $|u_0|$ then $v$ is legitimate with
$v \in L(x_1, x_2)$. Moreover, when $u, v \in L(x_1, x_2)$, then $|u|$ is congruent to $|v|$
modulo $|u_0|$.
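The set $S$, the languages $L(x_1, x_2)$ and the notion of a legitimate word can be explored with a small sketch. It assumes that $u_0$ is a primitive word given as a Python string; the function names are hypothetical.

```python
def pairs(u0):
    """The set S: all (x1, x2) with x1 a proper suffix and x2 a proper prefix of u0."""
    m = len(u0)
    return [(u0[m - i:], u0[:j]) for i in range(m) for j in range(m)]


def in_language(w, pair, u0):
    """Membership of w in L(x1, x2) as defined in the text."""
    x1, x2 = pair
    m = len(u0)
    # When |x1 x2| >= |u0|, the language also contains the short word z.
    if len(x1) + len(x2) >= m and w == (x1 + x2)[m:]:
        return True
    middle = len(w) - len(x1) - len(x2)
    if middle < 0 or middle % m != 0:
        return False
    return w == x1 + u0 * (middle // m) + x2


def is_legitimate(w, u0):
    """w is legitimate iff it is a subword of a sufficiently long power of u0."""
    k = len(w) // len(u0) + 2
    return w in u0 * k
```

With $u_0 = 011$, the word $1101$ is legitimate and lies only in $L(11, 01)$, while the short word $1$ lies in both $L(1, \varepsilon)$ and $L(11, 01)$, illustrating that only short words can be shared.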

Let $(x_1, x_2), (y_1, y_2) \in S$. Then $L(x_1, x_2)L(y_1, y_2)$ contains only legitimate
words iff $x_2y_1 \in \{u_0, \varepsilon\}$, in which case $L(x_1, x_2)L(y_1, y_2) \subseteq L(x_1, y_2)$. This
motivates the following definition. For any $(x_1, x_2)$ and $(y_1, y_2)$ in $S$, let

$$(x_1, x_2) \otimes (y_1, y_2) = \begin{cases} (x_1, y_2) & \text{if } x_2y_1 \in \{u_0, \varepsilon\} \\ \text{undefined} & \text{otherwise} \end{cases}$$

so that $\otimes$ is a partial operation on $S$. Thus, if $(x_1, x_2) \otimes (y_1, y_2) = (z_1, z_2)$,
then $L(x_1, x_2)L(y_1, y_2) \subseteq L(z_1, z_2)$; moreover, $(z_1, z_2)$ is the only element of
$S$ with this property.

Now let $(x_1, x_2) \in S$ and consider a word $y$. Then $L(x_1, x_2)y$ contains only
legitimate words iff $y \in L(y_1, y_2)$ for some $(y_1, y_2) \in S$ such that $x_2y_1 \in
\{u_0, \varepsilon\}$, in which case $(x_1, y_2)$ is the unique element of $S$ with $L(x_1, x_2)y \subseteq
L(x_1, y_2)$. Thus we define $(x_1, x_2) \otimes y = (x_1, y_2)$ if this holds, otherwise
$(x_1, x_2) \otimes y$ is not defined. We define $y \otimes (x_1, x_2)$ symmetrically. The partial
operation $\otimes$ is associative in a strong sense.
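The partial operation on pairs, and its variant with a word on the right, can be sketched as follows. The sketch assumes $u_0$ is a primitive word given as a Python string and returns `None` where the operation is undefined; the function names are hypothetical, and the helpers `pairs` and `in_language` restate the definitions of $S$ and $L(x_1, x_2)$ so that the block is self-contained.

```python
def pairs(u0):
    """The set S: all (x1, x2) with x1 a proper suffix and x2 a proper prefix of u0."""
    m = len(u0)
    return [(u0[m - i:], u0[:j]) for i in range(m) for j in range(m)]


def in_language(w, pair, u0):
    """Membership of w in L(x1, x2)."""
    x1, x2 = pair
    m = len(u0)
    if len(x1) + len(x2) >= m and w == (x1 + x2)[m:]:
        return True  # the extra short word z
    middle = len(w) - len(x1) - len(x2)
    if middle < 0 or middle % m != 0:
        return False
    return w == x1 + u0 * (middle // m) + x2


def tensor(a, b, u0):
    """(x1, x2) (x) (y1, y2): equals (x1, y2) when x2 y1 is u0 or empty, else None."""
    (x1, x2), (y1, y2) = a, b
    return (x1, y2) if x2 + y1 in (u0, "") else None


def tensor_word(a, y, u0):
    """(x1, x2) (x) y for a word y: look for a compatible pair whose language contains y."""
    x1, x2 = a
    for y1, y2 in pairs(u0):
        if x2 + y1 in (u0, "") and in_language(y, (y1, y2), u0):
            return (x1, y2)
    return None
```

For example, with $u_0 = 011$ one gets $(\varepsilon, 0) \otimes 1101 = (\varepsilon, 01)$, in line with $(011)^n 0 \cdot 1101 = (011)^{n+1} 01$, whereas $(\varepsilon, 0) \otimes 0$ is undefined because $00$ is not legitimate.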

Using the above notions, the fourth condition of Theorem 3.1 can be rephrased
as follows. For each strong component $C$ containing a recursive nonterminal
there is a primitive word $u_0 = u_0^C$ (unique up to conjugacy) such that for
all $X, Y \in C$ there is (a necessarily unique) $(x_1, x_2) \in S$ such that whenever
$X \Rightarrow^+ wYp$ then $w \in L(x_1, x_2)$.

As before, let us assume that $G = (N, \{0, 1\}, P, S)$ is a context-free grammar
that contains no useless nonterminals or $\varepsilon$-rules. Moreover, we assume that
$G$ is left-recursion free and that $L(G)$ is not empty.

Lemma 4.1 Suppose that the fourth condition of Theorem 3.1 holds and let
$C$ be a strong component containing a recursive nonterminal. Let $u_0 = u_0^C$,
and suppose that each nonterminal generates at least two terminal words.
Then for each $X$ such that $X_0 \Rightarrow^* pXqYr$ for some $X_0, Y \in C$ and words
$p, q, r$ there is a unique $(x_1, x_2) \in S$ with $L(X) \subseteq L(x_1, x_2)$.

Proof. Let $(y_1, y_2)$ denote the unique element of $S$ such that $w \in L(y_1, y_2)$
whenever $X_0 \Rightarrow^+ wYs$ for some $s$. Then $L(pXq) \subseteq L(y_1, y_2)$, so that
$uL(X)v \subseteq L(y_1, y_2)$ for any fixed $u \in L(p)$ and $v \in L(q)$. This is possible
only if $L(X) \subseteq L(x_1, x_2)$ for some $(x_1, x_2) \in S$. Since $L(X)$ contains at least
two words, $(x_1, x_2)$ is unique. □

Theorem 4.2 Suppose that each nonterminal generates a language of at
least two words. Then $(L(G), <_\ell)$ is a scattered linear ordering iff the
following holds for each strong component $C$ containing a recursive
nonterminal: There exists a primitive word $u_0$ such that for any two not necessarily
different nonterminals $X$ and $Y$ in $C$ there is some $\varphi(X, Y) \in S$ and for
each nonterminal $Z$ there is some $\psi(Z) \in S$ such that

$$\varphi(X, Y) \otimes \varphi(Y, Z) = \varphi(X, Z) \qquad (2)$$

for all $X, Y, Z \in C$, and such that the following hold for all productions
$X \to w_0Y_1w_1 \cdots Y_kw_k$:

1. If $X \in C$ and $Y_i \in C$ for some $i$, then

$$\varphi(X, Y_i) = w_0 \otimes \psi(Y_1) \otimes w_1 \otimes \cdots \otimes \psi(Y_{i-1}) \otimes w_{i-1}. \qquad (3)$$

2. If there is a derivation $X_0 \Rightarrow^* pXqYr$ for some $X_0, Y \in C$, then

$$\psi(X) = w_0 \otimes \psi(Y_1) \otimes w_1 \otimes \cdots \otimes \psi(Y_k) \otimes w_k. \qquad (4)$$

(In the degenerate case when $k = 0$ in the last equation, we mean that $w_0$
belongs to the language $L(\psi(X))$.)

Proof. Suppose that the conditions of the Theorem hold. Consider a strong
component $C$ containing a recursive nonterminal and the corresponding
primitive word $u_0$. Then for any $X$ such that there is a derivation $X_0 \Rightarrow^*
pXqYr$ for some $X_0, Y \in C$ we have that $L(X) \subseteq L(\psi(X))$:

Claim 1. Suppose that (4) holds for all appropriate rules. Then for each $X$
such that there is a derivation $X_0 \Rightarrow^* pXqYr$ for some $X_0, Y \in C$ it holds
that $L(X) \subseteq L(\psi(X))$.

Indeed, suppose that $X \Rightarrow^* w$. We prove that $w \in L(\psi(X))$ by induction on
the length of the derivation. When the length of the derivation is 1, the claim
is clear by (4). Suppose that the length is greater than 1. Then there exist
some rule $X \to w_0Y_1w_1 \cdots Y_kw_k$ and words $z_1, \ldots, z_k$ with $w = w_0z_1w_1 \cdots z_kw_k$
and $Y_i \Rightarrow^* z_i$ for all $i$. By the induction hypothesis we have that each $z_i$ is in
$L(\psi(Y_i))$. Since (4) holds, we conclude that $w = w_0z_1w_1 \cdots z_kw_k \in L(\psi(X))$.
This ends the proof of Claim 1.

Also, for any $X, Y \in C$ and words $w$ and $p$ with $X \Rightarrow^+ wYp$ we have that
$w \in L(\varphi(X, Y))$ as shown by the following claim:

Claim 2. Suppose that (2), (3) and (4) hold. Then if $X \Rightarrow^+ wYp$, where
$X, Y \in C$, then $w \in L(\varphi(X, Y))$.

To see this, consider a derivation tree whose root is labeled $X$ and whose
frontier is $wYp$. Let $Y_1 = X, Y_2, \ldots, Y_\ell, Y_{\ell+1} = Y$ be all the nonterminal
labels along the path from the root to the leaf labeled $Y$. Moreover, let
$Y_i \to p_iY_{i+1}q_i$ denote the rule used to rewrite $Y_i$, for $i = 1, \ldots, \ell$. By (3) we
have that

$$\varphi(Y_1, Y_2) = \psi(p_1), \ \ldots, \ \varphi(Y_\ell, Y_{\ell+1}) = \psi(p_\ell)$$

where if $p_i = z_0Z_1z_1 \cdots Z_kz_k$, say, then $\psi(p_i) = z_0 \otimes \psi(Z_1) \otimes z_1 \otimes \cdots \otimes \psi(Z_k) \otimes z_k$.
Now let us write $w = w_1 \cdots w_\ell$ with $p_i \Rightarrow^* w_i$ for all $i$. Using Claim 1 and
the equality $\varphi(Y_i, Y_{i+1}) = \psi(p_i)$, we obtain $w_i \in L(\varphi(Y_i, Y_{i+1}))$. Since this
holds for all $i$, we obtain by (2) that $w \in L(\varphi(X, Y))$.

We conclude that the fourth condition of Theorem 3.1 holds, so that $L(G)$
is scattered.

Suppose now that $L(G)$ is scattered. Then the fourth condition of
Theorem 3.1 holds. Suppose that $C$ is a strong component containing a recursive
nonterminal. Let $u_0 = u_0^C$. By assumption, for each $X, Y \in C$ there exists a
unique $(x_1, x_2) \in S$ such that whenever $X \Rightarrow^+ wYp$ then $w \in L(x_1, x_2)$.
Define $\varphi(X, Y) = (x_1, x_2)$. By Lemma 4.1, for each $X$ such that there is a
derivation $X_0 \Rightarrow^* pXqYr$ for some words $p, q, r$ and nonterminals $X_0, Y \in C$, there
is a unique $(x_1, x_2) \in S$ with $L(X) \subseteq L(x_1, x_2)$. Define $\psi(X) = (x_1, x_2)$.
The pairs so defined solve the system of equations in the Theorem. □

Theorem 4.3 It is decidable in exponential time whether a context-free lan- 
guage generated by a context-free grammar G is scattered. 

Proof. Without loss of generality we may assume that the terminal alphabet
is $\{0, 1\}$ and that the grammar $G$ contains no useless nonterminals or $\varepsilon$-rules.
Moreover, we may assume that $G$ is left-recursion free and each nonterminal
generates at least two terminal words.

First, for each $C$ containing a recursive nonterminal, one can compute in
exponential time a primitive word $u_0$ which is the only candidate for $u_0^C$.
This is done by finding in exponential time a left derivation $X \Rightarrow^+ wXp$,
with $X \in C$; then $u_0$ is the primitive root of $w$. Second, in the same way,
for any $X, Y \in C$, we can determine in exponential time the only candidate
for $\varphi(X, Y)$ by computing a left derivation $X \Rightarrow^+ wYp$, where the length of
$w$ is between $|u_0|$ and $2|u_0|$. Also, we can compute in exponential time the
only candidate for $\psi(X)$, for all appropriate $X$. Then it remains to check
that the equations of Theorem 4.2 hold. But there are a polynomial number
of them, and the validity of each can be checked in exponential time. □

The same result holds for deciding whether a context-free language is well- 
ordered. 

Theorem 4.4 It is decidable in exponential time whether a context-free 
grammar generates a well-ordered language. 

Proof. Again, we may restrict the grammars as in the previous proof. The
extra condition introduced in Theorem 3.7 can be tested in exponential time.
Hint: if $X \Rightarrow^* u_0^n u_1 Y p \Rightarrow u_0^n u 1 q$ is a left derivation, where $u0$ is a prefix
of $u_0$, then the length of the derivation can be bounded by an exponential. □

Remark 4.5 The algorithms given in the proofs of Theorems 4.3 and 4.4
run in polynomial time in the important special case when each nonterminal
generates a prefix-free language of at least two words, since in that case
whenever $X \to pYq$ is a rule such that $X \approx Y$, then $p \in \{0, 1\}^*$.